In [1]:
from msdas import *
%pylab inline
reload(annotations)
Out[1]:
When an input file is read, the Entry and Entry_name fields may not be set at all. In addition, the full sequence and the GO terms are not necessarily provided. The annotations module retrieves the UniProt entry names and all annotations.
In [2]:
filename = yeast.get_yeast_filenames()[0]
r = readers.MassSpecReader(filename)
At this point, the MassSpecReader (and its dataframe) holds the data and some metadata, but no UniProt information such as the entry. GO terms and UniProt intact information can also be retrieved from UniProt. The annotations module provides tools to fetch this kind of information automatically.
The input to Annotations can be a filename or an existing MassSpecReader instance.
In [3]:
a = annotations.Annotations(r, "YEAST", verbose=True)
In [4]:
a.annotations  # empty for now
In [5]:
a._mapping # empty for now
Out[5]:
In [6]:
a.get_uniprot_entries()  # requires a network connection; may take a few seconds
In [7]:
a._mapping
Out[7]:
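To make the role of such a mapping concrete, here is a minimal sketch in plain pandas of how a dictionary from protein name to entry could populate an Entry column. The protein names, the entry identifiers and the `mapping` dictionary are made-up placeholders, not the contents of `a._mapping`, and this is not necessarily how msdas does it internally.

```python
import pandas as pd

# Hypothetical data: names and identifiers are placeholders, not real entries.
df = pd.DataFrame({
    "Protein": ["PROT1", "PROT2", "PROT3"],
    "Psite": ["S10", "T25", "S7"],
})

# Hypothetical mapping from protein name to a database entry.
mapping = {"PROT1": "ENTRY_A", "PROT2": "ENTRY_B"}

# Series.map looks up each name; names absent from the mapping become NaN.
df["Entry"] = df["Protein"].map(mapping)
print(df)
```

Proteins without a match keep a missing value rather than raising an error, which makes unresolved names easy to spot afterwards.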
In [8]:
a.df[['Protein', 'Psite', 'Entry']].loc[0:10]  # .ix is deprecated in pandas; .loc gives the same label-based slice
Out[8]:
In [9]:
a.set_annotations()
In [10]:
a.df[['Protein', 'Psite', 'Entry']].loc[0:10]  # .ix is deprecated in pandas; .loc gives the same label-based slice
Out[10]:
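Since the Entry field may be unset for some rows, it can be useful to check how many proteins remain unresolved after annotation. The sketch below uses a small hypothetical dataframe with the same column names as above; the values are placeholders and no msdas call is involved.

```python
import pandas as pd

# Hypothetical table after an annotation step; None marks an unresolved protein.
df = pd.DataFrame({
    "Protein": ["PROT1", "PROT2", "PROT3"],
    "Psite": ["S10", "T25", "S7"],
    "Entry": ["ENTRY_A", None, "ENTRY_C"],
})

# Boolean mask of the rows whose Entry is still missing.
missing = df["Entry"].isnull()

n_missing = int(missing.sum())
unresolved = df.loc[missing, "Protein"].tolist()
print(n_missing, unresolved)
```

The same mask could presumably be applied to `a.df`, assuming unresolved entries are stored there as missing values.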